Routine clinical visits of a patient produce not only image data, but also non-image data containing clinical information regarding the patient, i.e., medical data is multi-modal in nature. Such heterogeneous modalities offer different and complementary perspectives on the same patient, resulting in more accurate clinical decisions when they are properly combined. However, despite its significance, how to effectively fuse the multi-modal medical data into a unified framework has received relatively little attention. In this paper, we propose an effective graph-based framework called HetMed (Heterogeneous Graph Learning for Multi-modal Medical Data Analysis) for fusing the multi-modal medical data. Specifically, we construct a multiplex network that incorporates multiple types of non-image features of patients to capture the complex relationship between patients in a systematic way, which leads to more accurate clinical decisions. Extensive experiments on various real-world datasets demonstrate the superiority and practicality of HetMed. The source code for HetMed is available at https://github.com/Sein-Kim/Multimodal-Medical.
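The abstract does not spell out the construction, but the multiplex idea can be pictured with a toy sketch: one k-nearest-neighbor patient-patient graph per non-image feature group, with the groups stacked as layers of a multiplex network. The feature-group names, the kNN rule, and k below are assumptions for illustration, not HetMed's actual procedure.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_multiplex_network(feature_groups, k=5):
    """Build one kNN patient-patient graph per non-image feature group.

    feature_groups: dict mapping a group name (e.g. "demographics") to an
    (n_patients, n_features) array. Returns {group_name: adjacency matrix}.
    """
    layers = {}
    for name, X in feature_groups.items():
        # Connect each patient to its k most similar patients within this group.
        A = kneighbors_graph(X, n_neighbors=k, metric="cosine", include_self=False)
        layers[name] = A.toarray()
    return layers

# Toy example: 100 patients and two hypothetical non-image feature groups.
rng = np.random.default_rng(0)
groups = {
    "demographics": rng.normal(size=(100, 4)),
    "lab_tests": rng.normal(size=(100, 12)),
}
multiplex = build_multiplex_network(groups, k=5)
print({name: A.shape for name, A in multiplex.items()})
```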
Salient object detection (SOD) has recently drawn attention, but has been studied less for high-resolution (HR) images. Unfortunately, HR images and their pixel-level annotations are certainly more labor-intensive and time-consuming to obtain than low-resolution (LR) images and annotations. We therefore propose an image-pyramid-based SOD framework, the Inverse Saliency Pyramid Reconstruction Network (InSPyReNet), for HR prediction without any HR dataset. We design InSPyReNet to produce a strict image pyramid structure, which enables it to blend multiple results with pyramid-based image blending. For HR prediction, we design a pyramid blending method that synthesizes two different image pyramids from a pair of LR and HR scales of the same image to overcome the effective receptive field (ERF) discrepancy. Our extensive evaluation on public LR and HR SOD benchmarks demonstrates that InSPyReNet surpasses state-of-the-art (SOTA) methods on various SOD metrics and boundary accuracy.
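As a rough illustration of the pyramid-blending idea (not InSPyReNet's actual architecture), a Laplacian-pyramid blend of an LR-scale and an HR-scale saliency prediction might look like the sketch below: coarse structure is taken from the LR branch and fine boundary detail from the HR branch.

```python
import numpy as np
from scipy.ndimage import zoom

def laplacian_pyramid(img, levels=3):
    """Decompose a single-channel saliency map into a Laplacian pyramid."""
    pyr, cur = [], img
    for _ in range(levels):
        low = zoom(cur, 0.5, order=1)                  # downsample
        up = zoom(low, np.array(cur.shape) / np.array(low.shape), order=1)
        pyr.append(cur - up)                           # band-pass residual
        cur = low
    pyr.append(cur)                                    # coarsest level
    return pyr

def blend_pyramids(lr_pred, hr_pred, levels=3):
    """Blend an LR-scale and an HR-scale prediction of the same image."""
    hr_pyr = laplacian_pyramid(hr_pred, levels)
    lr_up = zoom(lr_pred, np.array(hr_pred.shape) / np.array(lr_pred.shape), order=1)
    lr_pyr = laplacian_pyramid(lr_up, levels)
    blended = [lr_pyr[-1]]                             # coarse base from the LR branch
    for lap in reversed(hr_pyr[:-1]):                  # add HR detail bands back
        up = zoom(blended[-1], np.array(lap.shape) / np.array(blended[-1].shape), order=1)
        blended.append(up + lap)
    return np.clip(blended[-1], 0.0, 1.0)

# Toy usage: pretend these are saliency maps from the LR and HR forward passes.
lr_saliency = np.random.rand(256, 256)
hr_saliency = np.random.rand(1024, 1024)
print(blend_pyramids(lr_saliency, hr_saliency).shape)  # (1024, 1024)
```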
Forecasting traffic conditions is highly challenging because every road is strongly dependent on others in both space and time. Recently, to capture such spatial and temporal dependencies, specially designed architectures such as graph convolutional networks and temporal convolutional networks have been introduced. Despite remarkable progress in traffic forecasting, we find that deep-learning-based traffic forecasting models still fail in certain patterns, mainly in event situations (e.g., rapid speed drops). Although such failures are usually attributed to unpredictable noise, we find that they can be corrected by taking previous failures into account. Specifically, we observe autocorrelated errors in these failures, which indicates that some predictable information remains. In this study, to capture the correlation of errors, we introduce ResCAL, a residual estimation module for traffic forecasting, as a widely applicable add-on module for existing traffic forecasting models. ResCAL calibrates the predictions of existing models in real time by estimating future errors using previous errors and graph signals. Extensive experiments on METR-LA and PEMS-BAY demonstrate that ResCAL correctly captures the correlation of errors and corrects the failures of various traffic forecasting models in event situations.
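A toy sketch of the residual-calibration idea (not ResCAL itself, which also exploits graph signals): fit a linear predictor of the next error from a window of recent errors, then subtract the estimated error from the base model's forecast.

```python
import numpy as np

class ResidualCalibrator:
    """Toy residual-estimation add-on: predict the next error of a base
    forecaster from its last `window` errors and subtract it from the forecast.
    This mimics the idea of exploiting autocorrelated errors, not ResCAL itself."""

    def __init__(self, window=6):
        self.window = window
        self.coef = None

    def fit(self, errors):
        # Stack lagged error windows and fit a least-squares predictor of the next error.
        X = np.stack([errors[i:i + self.window]
                      for i in range(len(errors) - self.window)])
        y = errors[self.window:]
        self.coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    def calibrate(self, base_prediction, recent_errors):
        est_error = recent_errors[-self.window:] @ self.coef
        return base_prediction - est_error

# Toy usage on a single sensor: errors = prediction - ground truth.
rng = np.random.default_rng(0)
errors = np.convolve(rng.normal(size=300), np.ones(5) / 5, mode="same")  # autocorrelated
cal = ResidualCalibrator(window=6)
cal.fit(errors[:250])
print(cal.calibrate(base_prediction=58.0, recent_errors=errors[244:250]))
```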
Deep learning has achieved remarkable success in numerous domains with the help of massive amounts of big data. However, the quality of data labels is a concern because high-quality labels are lacking in many real-world scenarios. Since noisy labels severely degrade the generalization performance of deep neural networks, learning from noisy labels (robust training) has become an important task in modern deep learning applications. In this survey, we first describe the problem of learning with label noise from a supervised-learning perspective. Next, we provide a comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological differences, followed by a systematic comparison of six properties used to evaluate their superiority. Subsequently, we perform an in-depth analysis of noise-rate estimation and summarize the commonly used evaluation methodologies, including public noisy datasets and evaluation metrics. Finally, we present several promising research directions that can serve as a guideline for future research. All the contents will be available at https://github.com/songhwanjun/awesome-noisy-labels.
Sentence summarization shortens given texts while maintaining the core contents of the texts. Unsupervised approaches have been studied to summarize texts without human-written summaries. However, recent unsupervised models are extractive: they remove words from texts and are thus less flexible than abstractive summarization. In this work, we devise an abstractive model based on reinforcement learning without ground-truth summaries. We formulate unsupervised summarization as a Markov decision process with rewards representing the summary quality. To further enhance the summary quality, we develop a multi-summary learning mechanism that generates multiple summaries with varying lengths for a given text, while making the summaries mutually enhance each other. Experimental results show that the proposed model substantially outperforms both abstractive and extractive models, while frequently generating new words not contained in the input texts.
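A minimal sketch of the reward-driven formulation, assuming a REINFORCE-style update; the toy reward and the decoder interface below are placeholders for illustration, not the paper's actual design.

```python
import torch

def policy_gradient_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE-style loss: the sampled summary's log-likelihood is weighted
    by a scalar quality reward, so no reference summary is needed."""
    advantage = reward - baseline
    return -(advantage * token_log_probs.sum())

def toy_reward(summary_tokens, source_tokens, target_len=8):
    """Hypothetical reward: content overlap with the source minus a length
    penalty. The paper's reward design differs; this only shows the interface."""
    overlap = len(set(summary_tokens) & set(source_tokens)) / max(len(set(source_tokens)), 1)
    length_penalty = abs(len(summary_tokens) - target_len) / target_len
    return overlap - 0.5 * length_penalty

# Toy usage: log-probs would come from the summarizer's decoder for a sampled summary.
logits = torch.randn(8, 1000, requires_grad=True)
log_probs = torch.log_softmax(logits, dim=-1).max(dim=-1).values
reward = toy_reward(list("abcdefgh"), list("abcdefghijkl"))
loss = policy_gradient_loss(log_probs, reward)
loss.backward()  # gradients would flow into the summarizer's parameters
```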
Sequential recommender systems have shown effective recommendations by capturing users' interest drift. Existing sequential models fall into two groups: user-centric and item-centric models. A user-centric model captures personalized interest drift based on each user's sequential consumption history, but it does not explicitly consider whether the user's interest in items is sustained beyond the training time, i.e., interest sustainability. On the other hand, an item-centric model considers whether users' general interest is sustained after the training time, but it is not personalized. In this work, we propose a recommender system that takes the best of both worlds. Our proposed model captures personalized interest sustainability, indicating whether each user's interest in items will be sustained beyond the training time. We first formulate a task that requires predicting which items each user will consume during the training time based on the users' consumption history. We then propose simple yet effective schemes to augment users' sparse consumption histories. Extensive experiments show that the proposed model outperforms 10 baseline models on 11 real-world datasets. The codes are available at https://github.com/dmhyun/peris.
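One way to picture the auxiliary task, as a toy sketch (the field names and the split rule are assumptions, not PERIS's implementation): split each user's history at a pivot time and label each earlier item by whether the user consumes it again afterwards.

```python
from collections import defaultdict

def sustainability_labels(interactions, pivot_time):
    """For each user, label each item consumed before `pivot_time` by whether
    the user consumes it again afterwards (1 = interest sustained, 0 = not)."""
    before, after = defaultdict(set), defaultdict(set)
    for user, item, ts in interactions:
        (before if ts < pivot_time else after)[user].add(item)
    labels = {}
    for user, items in before.items():
        labels[user] = {item: int(item in after[user]) for item in items}
    return labels

# Toy usage: (user, item, timestamp) triples.
logs = [("u1", "i1", 1), ("u1", "i2", 2), ("u1", "i1", 9),
        ("u2", "i3", 3), ("u2", "i4", 8)]
print(sustainability_labels(logs, pivot_time=5))
# {'u1': {'i1': 1, 'i2': 0}, 'u2': {'i3': 0}}
```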
Over the last few years, graph representation learning (GRL) has been a powerful strategy for analyzing graph-structured data. Recently, GRL methods have shown promising results by adopting self-supervised learning methods originally developed for learning representations of images. Despite their success, existing GRL methods tend to overlook the inherent distinction between images and graphs: images are assumed to be independent and identically distributed, whereas graphs exhibit relational information among data instances, i.e., nodes. To fully benefit from the relational information inherent in graph-structured data, we propose a novel GRL method named RGRL, which learns from the relational information generated from the graph itself. RGRL learns node representations such that the relationship among nodes is invariant to augmentations, i.e., an augmentation-invariant relationship, which allows the node representations to vary as long as the relationship among the nodes is preserved. By considering the relationship among nodes from both global and local perspectives, RGRL overcomes the limitations of contrastive and non-contrastive methods and achieves the best of both. Extensive experiments on fourteen benchmark datasets over various downstream tasks demonstrate the superiority of RGRL over state-of-the-art baselines. The source code of RGRL is available at https://github.com/namkyeong/rgrl.
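A simplified sketch of the augmentation-invariant-relationship idea (not RGRL's full objective): match each node's similarity distribution over a set of anchor nodes across two augmented views of the graph.

```python
import torch
import torch.nn.functional as F

def relational_consistency_loss(z_online, z_target, anchors, tau=0.5):
    """Make the *relationship* between nodes augmentation-invariant: each node's
    similarity distribution over a set of anchor nodes should match across two
    augmented views. The anchor set and temperature are illustrative choices."""
    z_online = F.normalize(z_online, dim=-1)
    z_target = F.normalize(z_target, dim=-1)
    anchors = F.normalize(anchors, dim=-1)
    p_online = F.log_softmax(z_online @ anchors.t() / tau, dim=-1)
    with torch.no_grad():
        p_target = F.softmax(z_target @ anchors.t() / tau, dim=-1)
    return F.kl_div(p_online, p_target, reduction="batchmean")

# Toy usage: embeddings of the same nodes under two graph augmentations.
z1 = torch.randn(32, 64, requires_grad=True)
z2 = torch.randn(32, 64)
anchor_nodes = torch.randn(16, 64)
loss = relational_consistency_loss(z1, z2, anchor_nodes)
loss.backward()
```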
Identifying outlier documents, whose content differs from the majority of the documents in a corpus, plays an important role in managing large text collections. However, without any explicit information about the inlier (or target) distribution, existing unsupervised outlier detectors may produce unreliable results depending on the density or diversity of the outliers in the corpus. To address this challenge, we introduce a new task, referred to as out-of-category detection, which aims to distinguish documents according to their semantic relevance to the inlier (or target) categories by using the category names as weak supervision. In practice, this task is broadly applicable in that it can flexibly designate the scope of target categories according to users' interests while requiring only the target-category names as minimal guidance. In this paper, we present an out-of-category detection framework that effectively measures how far each document is from one of the target categories based on its category-specific relevance score. Our framework adopts a two-step approach: (i) it first generates pseudo-category labels for all unlabeled documents by exploiting the word-document similarity encoded in a text embedding space, and then (ii) it computes the confidence of each document from its target-category prediction by using the pseudo-labels. Experiments on real-world datasets show that our framework achieves the best detection performance among all baseline methods across various scenarios specifying different target categories.
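Step (i) can be illustrated with a toy sketch, assuming document and category-name embeddings are already available in a shared text embedding space; this shows only the scoring logic, not the full framework.

```python
import numpy as np

def pseudo_category_scores(doc_embs, category_name_embs):
    """Assign each unlabeled document a pseudo category via cosine similarity
    between its embedding and each target category-name embedding. A low maximum
    relevance score suggests the document may be out-of-category."""
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    c = category_name_embs / np.linalg.norm(category_name_embs, axis=1, keepdims=True)
    sims = d @ c.T                        # (n_docs, n_categories) relevance scores
    pseudo_labels = sims.argmax(axis=1)   # most relevant target category per document
    relevance = sims.max(axis=1)          # category-specific relevance of the best match
    return pseudo_labels, relevance

# Toy usage with random embeddings for 5 documents and 3 target categories.
rng = np.random.default_rng(0)
labels, scores = pseudo_category_scores(rng.normal(size=(5, 50)), rng.normal(size=(3, 50)))
print(labels, scores.round(3))
```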
3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, the Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable achievements, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. As a result, our model shows strong 3D consistency with fine details and smooth interpolation under conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF exhibits a Fr\'echet Inception Distance (FID) of 7.64 for 3D-aware face image synthesis at a $\text{128}^{2}$ resolution. Additionally, we provide FIDs of the generated 3D-aware images for each class of the datasets, as it is possible to synthesize class-conditional images with $\text{C}^{3}$G-NeRF.
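As a rough illustration of feeding conditional features to a discriminator, here is a projection-style conditioning head; this is one common GAN conditioning mechanism, not necessarily the one used in $\text{C}^{3}$G-NeRF.

```python
import torch
import torch.nn as nn

class ProjectionDiscriminatorHead(nn.Module):
    """Minimal conditional discriminator head: an unconditional realness score
    plus a projection term between a class embedding and the image features."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.linear = nn.Linear(feat_dim, 1)              # unconditional score
        self.embed = nn.Embedding(num_classes, feat_dim)  # class embedding

    def forward(self, features, labels):
        out = self.linear(features)
        out = out + (self.embed(labels) * features).sum(dim=1, keepdim=True)
        return out

# Toy usage: 4 feature vectors from a discriminator backbone, 3 classes.
head = ProjectionDiscriminatorHead(feat_dim=128, num_classes=3)
scores = head(torch.randn(4, 128), torch.tensor([0, 2, 1, 0]))
print(scores.shape)  # torch.Size([4, 1])
```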
In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique for corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings might range from a few days to several years. Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88, on our dataset.
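A minimal sketch of the described encoder, assuming an Inception v3 backbone with its classification head replaced and a two-layer projection head; the projection sizes and exactly which Inception layers are kept are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import inception_v3

class ReIDEncoder(nn.Module):
    """Inception v3 backbone plus a small projection head that maps each
    photograph to a normalized embedding for contrastive comparison."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.backbone = inception_v3(weights=None, aux_logits=False)
        self.backbone.fc = nn.Identity()       # expose the 2048-d pooled features
        self.projection = nn.Sequential(
            nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, embed_dim))

    def forward(self, x):
        return F.normalize(self.projection(self.backbone(x)), dim=-1)

# Toy usage: embedding similarity between two photographs of (possibly) the same fish.
encoder = ReIDEncoder().eval()
with torch.no_grad():
    a, b = torch.randn(1, 3, 299, 299), torch.randn(1, 3, 299, 299)
    print(F.cosine_similarity(encoder(a), encoder(b)).item())
```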